In this investigation, I wanted to shed light on the characteristics of trip data that could be used to predict trip duration. The main focus was on age, distance, user type, and gender.
The data set includes information about individual rides made in a bike-sharing system covering the greater San Francisco Bay Area. It contains approximately 183,412 records and 16 features.
Trip durations in the dataset span a very wide range of values. The number of trips first rises from around 1,400 to a peak of about 12,500 at roughly 350 seconds, then falls off to below 2,000.
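Because the durations cover such a wide range, one common way to make the distribution readable is to count trips in logarithmically spaced bins. The following is a minimal sketch of that idea using only the Python standard library; it is not necessarily how the chart in this analysis was produced. The function name `log_bin_counts`, the `bins_per_decade` parameter, and the sample durations are all assumptions made up for illustration.

```python
import math
from collections import Counter

def log_bin_counts(durations_sec, bins_per_decade=4):
    """Count trips per logarithmically spaced duration bin.

    Returns a dict mapping each bin's lower edge (seconds) to its trip count.
    """
    counts = Counter()
    for d in durations_sec:
        if d <= 0:
            continue  # a log scale is undefined for non-positive durations
        # Index of the bin that contains d on a log10 axis.
        idx = math.floor(math.log10(d) * bins_per_decade)
        lower_edge = 10 ** (idx / bins_per_decade)
        counts[round(lower_edge, 1)] += 1
    return dict(sorted(counts.items()))

# Toy durations in seconds, invented for illustration (NOT from the data set).
sample = [61, 120, 350, 355, 360, 900, 5400, 86000]
hist = log_bin_counts(sample)
for edge, n in hist.items():
    print(f"from {edge:>8} s: {n} trip(s)")
```

With four bins per decade, every bin covers the same multiplicative range, so short and very long trips can be compared on one axis without the long tail flattening the peak.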
The age values are concentrated between 20 and 40 years.
The chart below shows that the most frequent users are aged between 20 and 45.
The main thing I want to explore in this part of the analysis is how the three gender categories play into the relationship between trip duration and age.
Looking at age, duration, and user type together, Customers and Subscribers show similar trends between age and trip duration, but for Subscribers the trip duration is higher at older ages.
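A comparison like this can be sketched by grouping trips by user type and age band and averaging the duration in each group. The snippet below is a rough illustration using only the standard library; the function name `mean_duration_by_group`, the 10-year band width, and the toy records are assumptions invented for the example and do not reproduce any result from the data set.

```python
from collections import defaultdict
from statistics import mean

def mean_duration_by_group(trips, band_width=10):
    """Average trip duration per (user type, age band).

    `trips` is an iterable of (user_type, age, duration_sec) tuples.
    Each age band is labelled by its lower bound, e.g. 20 for ages 20-29.
    """
    groups = defaultdict(list)
    for user_type, age, duration_sec in trips:
        band_start = (age // band_width) * band_width
        groups[(user_type, band_start)].append(duration_sec)
    return {key: mean(vals) for key, vals in sorted(groups.items())}

# Toy records, invented for illustration (NOT taken from the data set).
toy_trips = [
    ("Subscriber", 25, 400), ("Subscriber", 28, 500),
    ("Subscriber", 52, 700),
    ("Customer", 24, 900), ("Customer", 27, 1100),
    ("Customer", 55, 1000),
]
summary = mean_duration_by_group(toy_trips)
for (user_type, band), avg in summary.items():
    print(f"{user_type:<10} {band}-{band + 9}: {avg:.0f} s")
```

Reading the averages side by side for each user type makes it easier to see whether the age trend differs between the two groups.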